Formação de gentílicos a partir de topônimos: descrição linguística e aprendizado automático (Formation of Demonyms from Toponyms: Linguistic Description and Machine Learning) [in Portuguese]

Authors

  • Roger Alfredo Marci Rodrigues Antunes
  • Thiago Pardo
  • Gladis Maria Barcelos Almeida
Abstract

This article describes the rules involved in transforming toponyms into demonyms, with the aim of identifying regularities. Based on these regularities, an algorithm capable of generating demonyms automatically is developed. The theoretical basis draws on concepts from Derivational Morphology. Methodologically, toponyms and demonyms from the Instituto Brasileiro de Geografia e Estatística (IBGE) serve as the data source, and procedures are created to make the data manipulable. A complementary machine learning process is also carried out. As a result, good accuracy is obtained in predicting demonyms, revealing new and relevant rules and features for the task.
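The abstract does not list the rules themselves, so the following is only a minimal sketch of what suffix-based demonym generation can look like, not the authors' algorithm. It assumes single-word toponyms, a hypothetical rule that drops one final vowel, and two common Portuguese demonym suffixes (-ense, -ano). The function name `candidate_demonyms` and its parameters are illustrative.

```python
import unicodedata

# Combining marks for stress accents (grave, acute, circumflex).
# Tilde and cedilla are deliberately kept, since they are not stress marks.
STRESS_MARKS = {"\u0300", "\u0301", "\u0302"}
VOWELS = set("aeiou")


def remove_stress_accents(word):
    """Drop stress accents, assuming stress moves onto the suffix."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    return unicodedata.normalize("NFC", kept)


def candidate_demonyms(toponym, suffixes=("ense", "ano")):
    """Return one candidate demonym per suffix (illustrative rule only)."""
    base = remove_stress_accents(toponym.strip().lower())
    # Hypothetical rule: drop a single final vowel before attaching the suffix.
    if base and base[-1] in VOWELS:
        base = base[:-1]
    return [base + suffix for suffix in suffixes]
```

For instance, `candidate_demonyms("Brasília")` includes the attested form brasiliense among its candidates, and `candidate_demonyms("Curitiba")` includes curitibano. The sketch only enumerates candidates. Choosing the right suffix for a given toponym is the part that the article's rules and machine learning step address.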

Similar resources

Desambiguação de Homógrafos-Heterófonos por Aprendizado de Máquina em Português Brasileiro (A Machine Learning Approach for Homographic Heterophone Disambiguation in Brazilian Portuguese)

To improve the quality of the speech produced by a text-to-speech system, it is important to obtain the maximum amount of information from the input text that may help in this task. In this context, word sense disambiguation plays an important role and remains a central problem for natural language processing applications. This paper proposes to model the ambiguity of words as a supervised...

Uma abordagem de classificação automática para Tipo de Pergunta e Tipo de Resposta (An Automatic Approach for Classification of Question Type and Answer Type) [in Portuguese]

Question type classification and answer type classification are very important tasks for Question Answering Systems. This paper presents an automatic approach using machine learning for these tasks. We used decision trees as the machine learning algorithm and 14 features developed using a tagger and a named entity system. Resumo. Question type and answer type classification are tasks...

Natural language processing for social inclusion: a text simplification architecture for different literacy levels

Text simplification is a research area of Natural Language Processing, whose goal is to maximize text comprehension through simplification of its linguistic structure. This paper presents our approach for Brazilian Portuguese text simplification. As people have different literacy levels, we take that into account when generating simplified texts. We propose an architecture for text simplificati...

Sistema de Aquisição Semi-Automática de Ontologias (A System for Semi-Automatic Ontology Acquisition)

This paper presents ongoing work on ontology learning from text, focusing on the acquisition of concepts and relations. To that end, this work investigates approaches for ontology learning and presents a proposal based on graph metrics to identify concepts, and on text analysis to find relations between the concepts. Resumo. This article presents ongoing work in the area of ap...

Text Simplification as Tree Transduction

Lexical and syntactic simplification aim to make texts more accessible to certain audiences. Syntactic simplification uses either hand-crafted linguistic rules for deep syntactic transformations, or machine learning techniques to model simpler transformations. Lexical simplification performs a lookup for synonyms followed by context and/or frequency-based models. In this paper we investigate mo...

Publication date: 2017